Version: 1.9.x

v1.9.0

Release Date

TBD, 2026


SDK

New Features

  • SDK: Support GPU parameter configuration (#1047)

  • SDK: rock datasets list now supports fast listing across OSS regions (#1010)

  • SDK: Add rock storage get command to download archived sandbox logs from OSS (#962)

Bug Fixes

  • SDK: Fix illegal characters in generated Harbor job names (#1031)

  • SDK: Fix OSS upload failure caused by wget -c resume not overwriting existing files (#992)


Sandbox

New Features

  • Support sandbox restart functionality (#1001)

  • Add /delete endpoint with cascade STOPPED → DELETED transition for --rm containers (#1038)

  • Introduce SandboxStateMachine for unified lifecycle state management (#988)

  • Add ops-jobs API with DB-persisted state and multi-pod safety (#1027)

  • Add parameter validation for Admin API endpoints (#985)

  • K8s Operator: Support disk quota limits (#994)
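
The lifecycle items above (restart, the STOPPED → DELETED cascade, and a unified state machine) can be pictured as a small transition table. This is only an illustrative sketch, not the actual SandboxStateMachine API: apart from the STOPPED and DELETED states named in these notes, every state name, the `ALLOWED` table, and the `transition` and `stop` helpers are assumptions. In particular, tying the cascade to an `auto_remove` flag on `stop` is a guess at how --rm containers might be handled.

```python
from enum import Enum


class State(Enum):
    # Only STOPPED and DELETED appear in the release notes;
    # PENDING and RUNNING are hypothetical placeholders.
    PENDING = "pending"
    RUNNING = "running"
    STOPPED = "stopped"
    DELETED = "deleted"


# Hypothetical table of legal transitions, keyed by current state.
ALLOWED = {
    State.PENDING: {State.RUNNING, State.STOPPED},
    State.RUNNING: {State.STOPPED},
    State.STOPPED: {State.RUNNING, State.DELETED},  # restart or delete
    State.DELETED: set(),                           # terminal
}


def transition(current: State, target: State) -> State:
    """Return the new state, or raise if the move is not in ALLOWED."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target


def stop(current: State, auto_remove: bool = False) -> State:
    """Stop a sandbox; with auto_remove (assumed to mirror --rm),
    cascade STOPPED -> DELETED in the same call."""
    state = transition(current, State.STOPPED)
    if auto_remove:
        state = transition(state, State.DELETED)
    return state
```

Routing every change through one `transition` function means an illegal move such as DELETED → RUNNING fails in a single place instead of being checked separately by each caller.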

Bug Fixes

  • Fix stop reason being lost after the #988 FSM refactor (#1021)

  • Fix exception handling when actor not found in RayOperator.get_status() (#1062)

  • Fix exception handling when CRD not found in K8sOperator.get_status() (#1068)

  • Fix stop_time not being written by stop() when start_time is absent, as happens for sandboxes that failed to start (#1020)

  • Fix start() not properly delegating to start_async(), which caused the meta store write to be skipped (#1051)

  • Fix Admin SandboxTable retry on stale connection after DB restart (#987)

Refactoring

  • Meta-store: Add Redis-merge semantics for archive and alive-key field filtering (#1037)

Deployments

New Features

  • Split docker run into docker create + docker start -a for finer container lifecycle control (#1012)

  • Share docker rootfs XFS project ID with sandbox log directory quota (#1013)
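
The split of docker run into docker create followed by docker start -a can be sketched as below. This is a minimal illustration, not the deployment code: `create_cmd`, `start_cmd`, `run_split`, and the injectable `runner` parameter are hypothetical names, and only the `docker create`, `docker start -a`, and `--name` CLI forms are taken from Docker itself.

```python
import subprocess
from typing import Callable, List


def create_cmd(image: str, name: str) -> List[str]:
    """argv for `docker create`, which prints the new container ID."""
    return ["docker", "create", "--name", name, image]


def start_cmd(container_id: str) -> List[str]:
    """argv for `docker start -a`, which attaches to the container's output."""
    return ["docker", "start", "-a", container_id]


def run_split(image: str, name: str, runner: Callable = subprocess.run) -> int:
    """Create the container first, then start it attached.

    Returns the exit status of the `docker start -a` process.
    """
    created = runner(
        create_cmd(image, name), check=True, capture_output=True, text=True
    )
    container_id = created.stdout.strip()
    # At this point the container exists but is not running, so per-container
    # setup (for example, applying a quota to its rootfs) could happen here.
    started = runner(start_cmd(container_id))
    return started.returncode
```

The gap between the two commands is the point of the split: the container ID is known before any process in the container starts.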


Scheduler

New Features

  • Switch FileCleanupTask to find -delete with minimal path safety guards (#967)

  • Add DB-driven SandboxLogArchiveTask, replacing legacy sentinel file design (#1025)

  • Enhance Ray log cleanup (#1029):

    • PART 2c: Clean runtime_env_setup-* files (covers hex suffix)

    • PART 2d: Clean rotated daemon logs (raylet.N.out, gcs_server.N.err, etc.)

    • PID-aware cleanup for session_latest/logs + logs/old directory

    • Protect agent-* and other daemon files from PID probe false positives

  • Deduplicate region scheduler.tasks via base config inheritance (#1003)
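
As a rough illustration of the find -delete approach with path safety guards, the sketch below builds a guarded command line. It is not the FileCleanupTask implementation: `build_cleanup_cmd`, `MIN_DEPTH`, and the choice of guards are assumptions. Exclusions use `-not -path` because GNU find's `-delete` implies `-depth`, under which `-prune` has no effect.

```python
from pathlib import PurePosixPath
from typing import List, Sequence

# Hypothetical guard: refuse "/" and top-level directories such as "/var".
MIN_DEPTH = 2


def build_cleanup_cmd(
    root: str, older_than_days: int, exclude_dirs: Sequence[str] = ()
) -> List[str]:
    """Build a `find ... -delete` argv for files older than N days under root."""
    path = PurePosixPath(root)
    if not path.is_absolute() or ".." in path.parts:
        raise ValueError(f"refusing non-absolute or '..' path: {root!r}")
    # parts of "/data/logs" are ('/', 'data', 'logs'), i.e. depth 2.
    if len(path.parts) - 1 < MIN_DEPTH:
        raise ValueError(f"refusing shallow path: {root!r}")

    cmd = ["find", str(path), "-mindepth", "1"]
    for name in exclude_dirs:
        # Skip everything below an excluded directory without relying on -prune.
        cmd += ["-not", "-path", f"{path / name}/*"]
    cmd += ["-type", "f", "-mtime", f"+{older_than_days}", "-delete"]
    return cmd
```

Returning an argv list rather than a shell string keeps unusual directory names from being reinterpreted by a shell when the command is passed to `subprocess.run`.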

Bug Fixes

  • FileCleanupTask: Fix exclude_dirs whitelist being ineffective because -depth disables -prune; exclusions now use -not -path (#1072)

  • FileCleanupTask: Fix PID/TID reuse false positives in check_pid_exists by adding process name verification (#1074)

  • FileCleanupTask: Use find -type d in _discover_candidates to skip daemon log files (#1025)

  • ImageCleanupTask: Split idempotent prune from docuum launch logic (#1023)

  • SandboxLogArchiveTask: Fix cross-event-loop asyncpg pool issue, dispatch DB calls to main loop (#1025)

  • Scheduler: Add 60s timeout cap on cross-loop dispatch to prevent hang (#1025)
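
The last two fixes follow a common asyncio pattern: a worker thread hands a coroutine to the loop that owns the connection pool and waits for the result with a bounded timeout. The sketch below shows that pattern with the standard library only. `call_on_main_loop` and `DISPATCH_TIMEOUT_S` are hypothetical names, and the scheduler's actual code may differ.

```python
import asyncio
import concurrent.futures

# Assumed to correspond to the 60s cap mentioned above.
DISPATCH_TIMEOUT_S = 60


def call_on_main_loop(coro, main_loop: asyncio.AbstractEventLoop,
                      timeout: float = DISPATCH_TIMEOUT_S):
    """Run `coro` on `main_loop` from another thread and wait for its result.

    Must be called from a thread other than the one running `main_loop`;
    otherwise the blocking wait would stall the loop itself.
    """
    future = asyncio.run_coroutine_threadsafe(coro, main_loop)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # Ask the loop to cancel the coroutine so it does not linger.
        future.cancel()
        raise
```

Without the `timeout` argument, `future.result()` blocks indefinitely if the main loop is stuck or has stopped, which is the kind of hang a cap is meant to prevent.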


Rocklet

New Features

  • Add per-disk usage monitoring for rootfs, log, and kata DinD (#983)

Bug Fixes

  • Fix /execute and /read_file returning 422 due to NonBlankStr regression from PR #985 (#1065)

  • Fix success and file_name not set correctly in UploadResponse (#1060)

  • Use cgroup metrics instead of psutil for container memory, fixing inaccurate metrics in DinD (#1017)
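
For context on the cgroup-based memory fix, the sketch below reads a container's memory usage from the cgroup filesystem. It is an illustration under stated assumptions, not Rocklet's code: `read_cgroup_memory_bytes` and its `cgroup_root` parameter are hypothetical, and the file names are the standard cgroup v2 (`memory.current`) and v1 (`memory/memory.usage_in_bytes`) interfaces as seen from inside a container.

```python
from pathlib import Path
from typing import Optional


def read_cgroup_memory_bytes(cgroup_root: str = "/sys/fs/cgroup") -> Optional[int]:
    """Return current memory usage in bytes for this cgroup, or None if unavailable."""
    root = Path(cgroup_root)
    candidates = (
        root / "memory.current",                    # cgroup v2
        root / "memory" / "memory.usage_in_bytes",  # cgroup v1
    )
    for stat_file in candidates:
        try:
            return int(stat_file.read_text().strip())
        except (OSError, ValueError):
            # File missing, unreadable, or not a plain integer: try the next one.
            continue
    return None
```

Host-wide readings from a process library may not reflect the container's own limit and usage in nested setups, whereas these files report the figure for the cgroup itself.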


Harbor (Agent Job)

New Features

  • Add tracking support to Harbor environment config, job config, and api_key field (#999)

CI/Testing

  • CI: Run admin+network tests only on push, skip on PRs (#1040)

  • Fix docker disk-limit test cases and cross-platform CI compatibility (#967)


Documentation

  • Add v1.8.x sandbox concurrent-creation benchmark report and scheduler user guide (#1035)

  • Refresh README with v1.4.0 – v1.8.0 release lineup and fix release notes links (#1034)